feat(seer): Wire explorer chat write site through SeerRun outbox by trevor-e · Pull Request #115231 · getsentry/sentry

trevor-e · 2026-05-08T20:59:14Z

Summary

- Adds optional run_type param to SeerAgentClient.start_run(): when provided and organizations:seer-run-mirror is on, routes through the outbox
- Explorer chat endpoint now passes run_type=SeerRunType.EXPLORER
- Other callers (trigger_autofix_agent, etc.) omit it, keeping the direct path unchanged
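
The dispatch rule in the summary can be sketched as below. This is a simplified illustration, not the code in this PR: the `mirror_enabled`, `direct`, and `via_outbox` parameters are hypothetical stand-ins for the feature-flag check and the two code paths, and the enum value is assumed.

```python
from enum import Enum
from typing import Callable, Optional


class SeerRunType(Enum):
    # Only the member name EXPLORER comes from this PR; the value is assumed.
    EXPLORER = "explorer"


def start_run(
    *,
    mirror_enabled: bool,
    direct: Callable[[], int],
    via_outbox: Callable[[SeerRunType], int],
    run_type: Optional[SeerRunType] = None,
) -> int:
    """Return a run_id, taking the outbox path only when both conditions hold."""
    if run_type is not None and mirror_enabled:
        return via_outbox(run_type)
    # Callers that omit run_type, or orgs without the flag, keep the direct path.
    return direct()
```

Making `run_type` optional is what lets the other callers stay untouched: they never enter the outbox branch regardless of the flag.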

Stacked on #115111. Third write site conversion (after assisted-query and legacy autofix).

Test plan

- New test test_outbox_path_creates_run_and_returns_run_id verifies SeerRun creation, outbox dispatch, and run_id return via the client
- Existing test_post_new_conversation_calls_client updated to expect run_type=SeerRunType.EXPLORER
- mypy + ruff clean

Register an outbox category and cell receiver to handle SeerRun creation via the hybrid cloud outbox system.

…flag

Rename idempotency_key to external_idempotency_key in the SeerRun outbox receiver to match the field name Seer's request models expect. Register the organizations:seer-run-mirror feature flag for future write-site gating.

Wrap response.json() in the SeerRun outbox receiver in a JSONDecodeError guard. A 2xx response with a malformed body would otherwise raise uncaught and trap the outbox row in indefinite retry. Treat it as terminal, matching how 4xx is handled. Also declare external_idempotency_key on AgentChatRequest and SearchAgentStartRequest and cast the receiver bodies to those TypedDicts so the call signatures type-check.

The previous fix imported JSONDecodeError via sentry.utils.json (which re-exports simplejson). urllib3 BaseHTTPResponse.json() raises stdlib json.JSONDecodeError, an unrelated class. The except clause never matched, so a 2xx with malformed body would still propagate uncaught and trap the outbox row in indefinite retry. Switch to stdlib json with a noqa for the S003 rule.

Match the immediate-neighbor convention (.get() + try/except DoesNotExist) for the missing-row check, and place handle_seer_run_create at the end of cell.py so newest receivers append rather than prepend.

Match against SeerRunType(run.type) instead of the raw str field, then call assert_never on the default branch. mypy now flags any new SeerRunType variant that does not have a case in handle_seer_run_create at compile time.

Behind the organizations:seer-run-mirror flag, the search-agent endpoint now creates a SeerRun + CellOutbox row in a transaction. The receiver fires on commit (flush=True), makes the HTTPS call to Seer with run.uuid as external_idempotency_key, and fills in seer_run_state_id. Synchronous flush preserves the existing endpoint contract: the endpoint still returns run_id from the response. The other write sites (start_run, autofix, PR review, replay) follow in their own PRs.
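
The flow above can be illustrated with a toy in-memory model. This is not the hybrid cloud outbox API: the `Run` dataclass, the list standing in for the outbox table, and the `send` callback are all assumptions made for the sketch.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Run:
    uuid: str = field(default_factory=lambda: uuid.uuid4().hex)
    seer_run_state_id: Optional[int] = None


def start_with_outbox(send: Callable[[dict], int]) -> Run:
    run = Run()
    # Stands in for the SeerRun + CellOutbox rows written in one transaction.
    outbox = [run]
    # Synchronous drain on commit: the caller can still return run_id.
    for pending in outbox:
        if pending.seer_run_state_id is None:
            # The run's own uuid is the idempotency key, so a retried
            # delivery refers to the same run on the Seer side.
            body = {"external_idempotency_key": pending.uuid}
            pending.seer_run_state_id = send(body)
    return run
```

Usage: `start_with_outbox(lambda body: 99)` returns a `Run` whose `seer_run_state_id` is 99.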

Push the seer-run-mirror flag check inside send_search_agent_start_request so the endpoint has a single call site that returns run_id directly. Eliminates the parallel start_search_agent_via_outbox helper and the flag-aware branching in the endpoint, and inlines the body construction that previously lived in _build_search_agent_body.

…al handling

Wrap payload extraction and SeerRunType parsing in a single try/except so malformed outbox rows mark the run FAILED instead of crashing the receiver and stalling the queue. Extract a small _mark_seer_run_failed helper shared by the three terminal-failure sites (invalid payload, 4xx response, 2xx with malformed JSON body).

The previous refactor that folded the flag dispatch into send_search_agent_start_request dropped the search_agent.missing_run_id error log along with its organization/project/response_data context. Restore it inside send_search_agent_start_request before the SeerApiError is raised.

…-facing form

Add a proper create_seer_run factory method to Factories/Fixtures so SeerRun test instances use the standard test helper pattern. Remove test_passes_idempotency_key which tested an implementation detail (single-line dict merge) already covered by the happy-path tests.

Co-Authored-By: Claude <noreply@anthropic.com>

Co-Authored-By: Claude <noreply@anthropic.com>

… guard

Two review fixes in the SeerRun outbox receiver:

- PR_REVIEW raised NotImplementedError, which the outbox treats as transient and retries forever. Mark the run FAILED and return instead until PR_REVIEW dispatch is wired.
- The idempotency early-return used truthiness on seer_run_state_id, which would re-issue the Seer request for the (legal) value 0. Compare against None explicitly.
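
The second fix comes down to the difference between a falsy id and a missing id. A small sketch, with hypothetical function names:

```python
from typing import Optional


def already_dispatched_truthy(seer_run_state_id: Optional[int]) -> bool:
    # Buggy form: 0 is falsy, so a run whose Seer id is 0 looks undispatched.
    return bool(seer_run_state_id)


def already_dispatched(seer_run_state_id: Optional[int]) -> bool:
    # Fixed form: only a missing id means the request has not been sent yet.
    return seer_run_state_id is not None
```

The two forms agree for every value except 0, which is exactly the case the review flagged.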

A structurally valid 2xx Seer response that lacks a run_id field won't self-heal on retry — same terminal class as the malformed-JSON case immediately above. Mark the run FAILED and return instead of raising RuntimeError, which the outbox treats as transient and would retry indefinitely.

dict(viewer_context or {}) coerced None into an empty dict, which the receiver would then forward to _resolve_viewer_context as a non-None ViewerContext with null fields — triggering JWT signing instead of being skipped. Preserve None when the caller passes None so the downstream skip path stays intact for future write sites.
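
A short sketch of the None-preserving conversion, assuming the viewer context is a plain mapping. The function name is hypothetical.

```python
from typing import Any, Mapping, Optional


def serialize_viewer_context(
    viewer_context: Optional[Mapping[str, Any]],
) -> Optional[dict[str, Any]]:
    # dict(viewer_context or {}) would turn None into {}, which downstream
    # code cannot tell apart from a real but empty context.
    return dict(viewer_context) if viewer_context is not None else None
```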

urllib3's BaseHTTPResponse.json() raises UnicodeDecodeError for non-UTF-8 bodies in addition to json.JSONDecodeError. Both are terminal: a non-UTF-8 binary body from a misbehaving proxy won't self-heal on retry. Catch both in the same except clause.
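
A sketch of the combined guard, assuming the body is decoded as UTF-8 and then parsed with the stdlib `json` module. `decode_body` is a hypothetical helper, not the receiver's code.

```python
import json
from typing import Any


def decode_body(raw: bytes) -> tuple[bool, Any]:
    """Return (ok, data). ok=False means the body can never parse (terminal)."""
    try:
        # Decode as UTF-8, then parse with stdlib json.
        return True, json.loads(raw.decode("utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Neither failure heals on retry, so the caller should not re-queue.
        return False, None
```

Returning a flag alongside the data keeps a legitimate JSON `null` body distinguishable from a decode failure.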

Remove the two terminal-case comments that just restated the function flow, and move _mark_seer_run_failed below handle_seer_run_create per the public-then-private convention.

Caller saw an opaque OutboxFlushError when the synchronous drain failed; the endpoint's existing SeerApiError handler is a better fit. Same translation pattern token_exchange/{manual_,}refresher.py uses for the same reason. The async outbox retry will heal the mirror state separately.
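
The translation can be sketched as follows. Both exception classes here are local stand-ins defined for the example; the `SeerApiError(message, status)` shape is assumed from the diff quoted in the review, and `flush` is a hypothetical callable representing the synchronous drain.

```python
from typing import Callable


class OutboxFlushError(Exception):
    """Stand-in for the outbox's flush failure."""


class SeerApiError(Exception):
    """Stand-in for the error type the endpoint already handles."""

    def __init__(self, message: str, status: int) -> None:
        super().__init__(message)
        self.status = status


def start_via_outbox(flush: Callable[[], int]) -> int:
    try:
        return flush()
    except OutboxFlushError as e:
        # Re-raise as the type the endpoint handles, keeping the original
        # exception as __cause__ for debugging.
        raise SeerApiError("Seer request failed", 500) from e
```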

When the synchronous drain fails, the SeerRun row is already committed and the async outbox retry will eventually heal it. Surface the run uuid as a retry_token in the 500 response so future frontend logic can resume that same run instead of creating a duplicate via a fresh idempotency key. No client changes today; the field is forward-compatible.

If response.json() returns a list/scalar/null instead of an object, data.get('run_id') would raise AttributeError and stall the outbox shard on retries. Add an isinstance check inside the existing try so non-dict bodies route through the same invalid_json_body terminal path as undecodable bodies.
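
A sketch of the shape check, with a hypothetical helper name. It treats a non-object body and a missing run_id the same way, as a value the caller should handle as terminal.

```python
from typing import Any, Optional


def extract_run_id(data: Any) -> Optional[Any]:
    """Return run_id from a decoded body, or None when the body is unusable."""
    if not isinstance(data, dict):
        # list / scalar / null: data.get() would raise AttributeError here.
        return None
    return data.get("run_id")
```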

Add an optional `run_type` parameter to `SeerAgentClient.start_run()`. When provided and the `organizations:seer-run-mirror` flag is on, the method creates a SeerRun + CellOutbox entry instead of calling Seer directly. The explorer chat endpoint now passes `run_type=SeerRunType.EXPLORER`. Other callers (trigger_autofix_agent, etc.) omit it, keeping the direct path unchanged.

Co-Authored-By: Claude <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 9e2060d.

cursor · 2026-05-08T21:03:26Z

+                        "body": dict(chat_body),
+                        "viewer_context": dict(self.viewer_context),
+                    },
+                ).save()


Missing OutboxFlushError handling in outbox path

High Severity

The outbox context with flush=True can raise OutboxFlushError if the signal receiver fails (e.g., Seer returns a 500, triggering a RuntimeError in handle_seer_run_create). This exception is not caught here, unlike the analogous code in search_agent_start.py which wraps the block in a try/except OutboxFlushError. Since OutboxFlushError does not inherit from SeerApiError, the endpoint's exception handler in organization_seer_agent_chat.py won't catch it either, resulting in an unhandled crash instead of a graceful error response.

cursor · 2026-05-08T21:03:26Z

+            run.refresh_from_db()
+            if run.mirror_status != SeerRunMirrorStatus.LIVE or run.seer_run_state_id is None:
+                raise SeerApiError("Seer run mirror failed to materialize", 500)
+            return run.seer_run_state_id


Outbox path skips explorer index triggering

Medium Severity

The outbox path returns early at line 380, completely bypassing _maybe_trigger_explorer_index_for_new_run. Since the explorer chat endpoint is the only caller passing run_type, all explorer runs routed through the outbox will never trigger project indexing or context-engine indexing, even when Seer's response indicates missing indexes. The response data containing has_explorer_index and has_org_project_context is consumed only in the receiver, which discards those fields.

Additional Locations (1)

src/sentry/seer/agent/client.py#L387-L394

trevor-e and others added 25 commits May 7, 2026 15:55

feat(seer): Add outbox receiver for SeerRun creation

05b2b37

ref(seer): Use .get() and move SeerRun receiver to end of cell.py

536814f

ref(seer): Use assert_never for exhaustive SeerRunType match

a2e9612

ref(seer): Drop unused shard_identifier from SeerRun receiver signature

cd7a1c1

ref(seer): Revert send_search_agent_start_request docstring to caller…

39386fa

…-facing form

ref(seer): Move SeerRunType import to module level in factories

3f0457d

Co-Authored-By: Claude <noreply@anthropic.com>

ref(seer): Make type an explicit kwarg in create_seer_run factory

5941453

Co-Authored-By: Claude <noreply@anthropic.com>

Merge branch 'master' into telkins/seer-run-outbox

d5f18bf

ref(seer): Drop verbose comments over skip cache functions

f4cfa2a

github-actions Bot added the Scope: Backend label May 8, 2026

cursor Bot reviewed May 8, 2026

Base automatically changed from telkins/seer-run-outbox to master May 13, 2026 16:07

trevor-e wants to merge 25 commits into master from telkins/seer-run-outbox-explorer
